question: Why can traditional methods in anomaly detection be weak in handling complex data? option 1: They don't have end-to-end optimization option 2: They don't offer options to unify anomaly detection and explanation option 3: They lack the ability to learn intricate structures and relations option 4: They can't handle diverse types of data option 5: They require large-scale labeled data 